A Comparison of Approaches to Word Class Tagging: Disjunctively vs. Conjunctively Written Bantu Languages

Authors

  • ELSABÉ TALJARD
  • SONJA E. BOSCH
Abstract

Northern Sotho and Zulu are two South African Bantu languages that use different writing systems: a disjunctive and a conjunctive writing system, respectively. In this article it is argued that the different orthographic systems obscure the morphological similarities between the two languages and that these systems have a direct impact on word class tagging. It is shown not only that different approaches are needed for word class tagging, but also that the sequencing of tasks is to a large extent determined by the difference in writing systems.
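The contrast between the two writing systems can be illustrated with a minimal sketch. The example phrases, the hand-written segmentation table, and the function names below are illustrative assumptions and are not taken from the article; the sketch only shows why whitespace tokenisation yields morpheme-like units for a disjunctive orthography but a single unanalysed word for a conjunctive one, so that morphological analysis has to come earlier in the pipeline in the conjunctive case.

```python
# Illustrative examples (assumed, not quoted from the article);
# both mean roughly "I like them".
NORTHERN_SOTHO = "ke a ba rata"   # disjunctive: morphemes written as separate words
ZULU = "ngiyabathanda"            # conjunctive: the same kind of morphemes in one word

# Hypothetical hand-written segmentation for the single Zulu example.
# A real system would use a morphological analyser instead of a lookup table.
ZULU_SEGMENTATION = {"ngiyabathanda": ["ngi", "ya", "ba", "thanda"]}


def orthographic_tokens(text):
    """Split on whitespace only, as a naive tokeniser would."""
    return text.split()


def morphemes(tokens, segmentation=None):
    """Expand each token into morphemes if a segmentation is known for it."""
    segmentation = segmentation or {}
    result = []
    for token in tokens:
        result.extend(segmentation.get(token, [token]))
    return result


if __name__ == "__main__":
    print(orthographic_tokens(NORTHERN_SOTHO))
    print(orthographic_tokens(ZULU))
    print(morphemes(orthographic_tokens(ZULU), ZULU_SEGMENTATION))
```

In this sketch the disjunctive input needs a later step that groups tokens into linguistic words, whereas the conjunctive input needs an earlier step that splits words into morphemes, which is one way to read the claim that the writing system determines the sequencing of tasks.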


Similar resources

Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge of a Disjunctive Orthography

Prefixes of the Setswana verb • The subject agreement morphemes, written disjunctively, include non-consecutive subject agreement morphemes and consecutive subject agreement morphemes. For example, the non-consecutive subject agreement morpheme for class 5 is le as in lekau le a tshega (the young man is laughing), while the consecutive subject agreement morpheme for class 5 is la as in lekau la...


User-friendly Dictionaries for Zulu: An Exercise in Complexicography

In this paper the main features of Bantu lexicography are analysed through several case studies of Zulu dictionary features. Examples from both existing dictionaries as well as a forthcoming reference work are used in the analysis, which develops from verbs and nouns, gradually including more word classes, and ending with a detailed study of possessive pronouns. The latter serves as one example...


Finite state tokenisation of an orthographical disjunctive agglutinative language: The verbal segment of Northern Sotho

Tokenisation is an important first pre-processing step required to adequately test finite-state morphological analysers. In agglutinative languages each morpheme is concatenatively added on to form a complete morphological structure. Disjunctive agglutinative languages like Northern Sotho write these morphemes, for certain morphological categories only, as separate words separated by spaces or ...


Kappa-Join: Efficient Execution of Existential Quantification in XML Query Languages

XML query languages feature powerful primitives for formulating queries involving comparison expressions which are existentially quantified. If such comparisons involve several scopes, they are correlated, and become difficult to evaluate efficiently. In this paper, we develop a new ternary operator, called Kappa-Join, for efficiently evaluating queries with existential quantification. In XML q...


The Consequences of the Contacts between Bantu and Non-Bantu Languages around Lake Eyasi in Northern Tanzania

In rural Tanzania, the major recent influences on ethnic languages come from Kiswahili and English rather than from other ethnic languages, which had been in contact for so long, influencing each other. In this work, I report the results of an investigation of lexical changes in indigenous languages that aimed at examining how ethnic communities and their languages, namely Cushitic Iraqw, Nilotic Datooga, N...



Publication date: 2006